Use of Measured Aerosol Optical Depth and Precipitable Water to Model Clear Sky Irradiance

presented at: 2017 IEEE PVSC-44 by: Mark A. Mikofski, Clifford W. Hansen, William F. Holmgren and Gregory M. Kimball

Introduction

Predicting irradiance is crucial for developing solar power systems. Clear sky models are used to predict irradiance based on solar position and atmospheric data. This paper compares clear sky predictions using atmospheric data from ECMWF with irradiance measurments from SURFRAD. This notebook presents the analysis for the paper and presentation given by the authors at the 2017 IEEE PVSC-44 in Washingtion DC June 25-30th.

How to use this Jupyter notebook

This document is a Jupyter notebook. It can be used several different ways. You can view the notebook as a static HTML document. You can also clone the repository from GitHub using Git, install the requirements in a Python virtual environment and run the notebook interactively using Python.

PVLIB-Python

This analysis uses Sandia National Laboratory's PVLIB-Python software extensively. PVLIB-Python is a library of functions for modeling photovoltaic devices, irradiance and atmospheric conditions. The documentation for PVLIB-Python is online and there is more information at the Sandia PV Performance Model Collaborative.


In [1]:
# imports and settings
import os

import h5py
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
import pvlib
import seaborn as sns
import statsmodels.api as sm

from pvsc44_clearsky_aod import ecmwf_macc_tools

%matplotlib inline

sns.set_context('notebook', rc={'figure.figsize': (8, 6)})
sns.set(font_scale=1.5)


c:\python27\lib\site-packages\statsmodels\compat\pandas.py:56: FutureWarning: The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.
  from pandas.core import datetools

Data

Two sources of data are used in this analysis, SURFRAD and ECMWF, each described in the following sections.

SURFRAD

SURFRAD data from NOAA contains irradiance measurements at sub-hour increments from seven stations accross the United States. The data can be viewed online or downloaded from a NOAA FTP site ftp://aftp.cmdl.noaa.gov/data/radiation/surfrad/. Data is organized into folders for each station and yearly subfolders containing individual daily files. The README in each station's folder provides additional information such as the format of the files, types of data in each column, definitions, and abbreviations. The first two rows of each daily file contains header information about the file. The first line is the station name and the second line contains the latitude, longitude and elevation. Data is timestamped at the beginning of each interval. Starting around 2008, most sites increased sampling to 1-minute intervals, other years sampled at 3-minute intervals. The irradiance data includes direct (DNI), diffuse (DHI), and downwelling solar irradiance which is equivalent global horizontal irradiance (GHI). Units of irradiance are W/m2. Ambient temperature is given in Celsius, pressure in millibars and relative humidity in percent.

There are seven sites with data that overlaps the time span of the atmospheric data (see the next section). The following metadata was collected from the daily data files, which all contain the same header for each site, and the README.

station id UTC offset timezone station name City latitude longitude elevation
bon -6.0 US/Central Bondville Bondville, IL 40.05 -88.37 213
tbl -7.0 US/Mountain Table Mountain Boulder, CO 40.13 -105.24 1689
dra -8.0 US/Pacific Desert Rock Desert Rock, NV 36.62 -116.02 1007
fpk -7.0 US/Mountain Fort Peck Fort Peck, MT 48.31 -105.10 634
gwn -6.0 US/Central Goodwin Creek Goodwin Creek, MS 34.25 -89.87 98
psu -5.0 US/Eastern Penn State Penn State, PA 40.72 -77.93 376
sxf -6.0 US/Central Sioux Falls Sioux Falls, SD 43.73 -96.62 473

There are three additional sites, Alamosa-CO, Red Lake-AZ, Rutland-VT, and Wasco-OR, that don't have data that cover the same 10 years span of the atmospheric data. Since SURFRAD also contains some atmospheric data, perhaps we can return to examine these sites later.

The SURFRAD data we used is shared as surfrad.zip in a Public OneDrive folder. The zip file contains separate CSV files for each site and year, but only the timestamps, solar irradiances, ambient temperature, RH and pressure columns are preserved. This notebook assumes the files have been extracted to a folder in this directory called surfrad.


In [2]:
# get the "metadata" that contains the timezone, latitude, longitude, elevation and station id for SURFRAD
METADATA = pd.read_csv('metadata.csv', index_col=0)

In [3]:
# get SURFRAD data
station_id = 'bon'  # choose a station ID from the metadata list

# CONSTANTS
DATADIR = 'surfrad'  # folder where SURFRAD data is stored relative to this file
NA_VAL = '-9999.9'  # value SURFRAD uses for missing data
USECOLS = [0, 6, 7, 8, 9, 10, 11, 12]  # omit 1:year, 2:month, 3:day, 4:hour, 5:minute and 13:jday columns

# read 2003 data for the station, use first column as index, parse index as dates, and replace missing values with NaN
DATA = pd.read_csv(
    os.path.join(DATADIR, '%s03.csv' % station_id),
    index_col=0, usecols=USECOLS, na_values=NA_VAL, parse_dates=True
)

# append the data from 2003 to 2012 which coincides with the ECMWF data available
for y in xrange(2004, 2013):
    DATA = DATA.append(pd.read_csv(
        os.path.join(DATADIR, '%s%s.csv' % (station_id, str(y)[-2:])),
        index_col=0, usecols=USECOLS, na_values=NA_VAL, parse_dates=True
    ))

DATA['press'] = DATA['press'] * 100.0  # convert mbars to Pascals
DATA.index.rename('timestamps', inplace=True)  # rename the index to "timestamps"
DATA = DATA.tz_localize('UTC')  # set timezone to UTC

In [4]:
# plot some temperatures
DATA['ta']['2006-07-16T00:00:00':'2006-07-16T23:59:59'].tz_convert(METADATA['tz'][station_id]).plot()
plt.ylabel('ambient temperature $[\degree C]$')


Out[4]:
<matplotlib.text.Text at 0x261f24e0>

In [5]:
# plot some irradiance
DATA[['dni', 'dhi', 'ghi']]['2006-07-16T00:00:00':'2006-07-16T23:59:59'].tz_convert(METADATA['tz'][station_id]).plot()
plt.ylabel('irradiance $[W/m^2]$')


Out[5]:
<matplotlib.text.Text at 0x26554860>

ECMWF

The European Center for Medium Weather Forecast (ECMWF) hosts atmospheric data from Monitoring Amospheric Composition & Climate (MACC) and the EU Copernicus Atmospheric Monitoring Service (CAMS). This data set contains aerosol optical depth (AOD) measured at several wavelengths and total column water vapor (AKA: precipitable water) data in centimeters (cm) derived from ground measurments and satelite data for the entire globe from 2003 to 2012. The data can be downloaded online or from the ECMWF API using the Python API client after registering for an API key. The downloaded data used in this analysis has spatial resolution of 0.75 ° x 0.75 °. The timestamps are every 3-hours marked at the end of the measurement window - EG: each timestamp is the average over the preceeding interval. A sample script to download the data we used is included in this repo called all_ecmwf_macc.py. This data when loaded is nearly 20GB.


In [4]:
# get AOD and water vapor for the SURFRAD station
ATMOS = ecmwf_macc_tools.Atmosphere()  # get the downloaded ECMWF-MACC data for the world

# create a datetime series for the ECMWF-MACC data, it's available from 2003 to 2012 in 3 hour increment
# pandas creates timestamps at the *beginning* of the each interval,shifting the timestamps from the end
# to match the SURFRAD data
TIMES = pd.DatetimeIndex(start='2003-01-01T00:00:00', freq='3H', end='2012-12-31T23:59:59').tz_localize('UTC')

# get the atmospheric data for the SURFRAD station as a pandas dataframe, and append some addition useful calculations
# like AOD at some other wavelengths (380-nm, 500-nm, and 700nm) and convert precipitable water to cm.
STATION_ATM = pd.DataFrame(
    ATMOS.get_all_data(METADATA['latitude'][station_id], METADATA['longitude'][station_id]), index=TIMES
)

In [7]:
# plot some of the atmospheric data
atm_data = STATION_ATM[['tau380', 'tau500', 'aod550', 'tau700', 'aod1240']]['2006-07-13':'2006-07-17']
atm_data['2006-07-16T00:00:00':'2006-07-16T23:59:59'].tz_convert(METADATA['tz'][station_id]).plot()
plt.title('Atmospheric Data')
plt.ylabel('aerosol optical depth')
plt.savefig('%s_AOD_2006-07-16.png' % station_id)


Solar Position, Airmass and Precipitable Water

NREL's Solar Position Algorithm (SPA) is used to find solar position including the effects of atmospheric refraction. The path length through the atmosphere or airmass is also calculated. Finally if water vapor measurements are not available, they can be estimated from relative humidity.


In [5]:
# solarposition, airmass, pressure-corrected & water vapor
SOLPOS = pvlib.solarposition.get_solarposition(
    time=DATA.index,
    latitude=METADATA['latitude'][station_id],
    longitude=METADATA['longitude'][station_id],
    altitude=METADATA['elevation'][station_id],
    pressure=DATA['press'],
    temperature=DATA['ta']
)
AIRMASS = pvlib.atmosphere.relativeairmass(SOLPOS.apparent_zenith)  # relative airmass
AM_PRESS = pvlib.atmosphere.absoluteairmass(AIRMASS, pressure=DATA['press'])  # pressure corrected airmass
PWAT_CALC = pvlib.atmosphere.gueymard94_pw(DATA['ta'], DATA['rh'])  # estimate of atmospheric water vapor in cm
ETR = pvlib.irradiance.extraradiation(DATA.index)  # extra-terrestrial total solar radiation

# append these to the SOLPOS dataframe to keep them together more easily
SOLPOS.insert(6, 'am', AIRMASS)
SOLPOS.insert(7, 'amp', AM_PRESS)
SOLPOS.insert(8, 'pwat_calc', PWAT_CALC)
SOLPOS.insert(9, 'etr', ETR)


c:\python27\lib\site-packages\pvlib\atmosphere.py:215: RuntimeWarning: invalid value encountered in power
  0.50572*(((6.07995 + (90 - z)) ** - 1.6364))))
c:\python27\lib\site-packages\pvlib\atmosphere.py:319: RuntimeWarning: invalid value encountered in maximum
  pw = np.maximum(pw, 0.1)

In [9]:
# plot solar position for a day
SOLPOS[['apparent_zenith', 'azimuth']]['2006-07-16T00:00:00':'2006-07-16T23:59:59'].tz_convert(METADATA['tz'][station_id]).plot()
plt.ylabel('solar position $[\deg]$')


Out[9]:
<matplotlib.text.Text at 0x2b2d3320>

In [10]:
# compare calculated solar position to SURFRAD
(DATA['solzen'] / SOLPOS['apparent_zenith'] - 1)['2006-07-16T00:00:00':'2006-07-16T23:59:59'].tz_convert(METADATA['tz'][station_id]).plot()


Out[10]:
<matplotlib.axes._subplots.AxesSubplot at 0x2ce3ec50>

In [6]:
# concatenate atmospheric parameters so they're the same size
# pd.concat fills missing timestamps from low frequency datasets with NaN
atm_params = pd.concat([DATA, STATION_ATM, SOLPOS], axis=1)

# then fill in NaN in atmospheric data by padding with the previous value
atm_params['alpha'].fillna(method='pad', inplace=True)
atm_params['aod1240'].fillna(method='pad', inplace=True)
atm_params['aod550'].fillna(method='pad', inplace=True)
atm_params['pwat'].fillna(method='pad', inplace=True)
atm_params['tau380'].fillna(method='pad', inplace=True)
atm_params['tau500'].fillna(method='pad', inplace=True)
atm_params['tau700'].fillna(method='pad', inplace=True)
atm_params['tcwv'].fillna(method='pad', inplace=True)

In [12]:
# compare measured and calculated precipitable water
# Pad the ECMWF-MACC data since it's at 3-hour intervals, but SURFRAD is at 1-minute or 3-minute intervals.
# Then rollup daily averages to make long term trends easier to see.
pwat = pd.concat([atm_params['pwat'], atm_params['pwat_calc']], axis=1).resample('D').mean()
pwat['2006-01-01':'2006-12-31'].tz_convert(METADATA['tz'][station_id]).plot()
plt.title('Comparison of measured and calculated $P_{wat}$.')
plt.ylabel('Precipitable Water [cm]')
plt.legend(['Measured', 'Calculated'])


Out[12]:
<matplotlib.legend.Legend at 0x2bd757b8>

Linke Turbidity

Linke turbidity is an atmospheric parameter that combines AOD and Pwat. A historical data set (c. 2003) is parameter in the Ineichen clearsky model.


In [7]:
# lookup Linke turbidity
# use 3-hour intervals to speed up lookup
LT = pvlib.clearsky.lookup_linke_turbidity(
    time=TIMES,
    latitude=METADATA['latitude'][station_id],
    longitude=METADATA['longitude'][station_id],
)

# calculate Linke turbidity using Kasten pyrheliometric formula
LT_CALC = pvlib.atmosphere.kasten96_lt(
    atm_params['amp'], atm_params['pwat'], atm_params['tau700']
)

# calculate broadband AOD using Bird & Hulstrom approximation
AOD_CALC = pvlib.atmosphere.bird_hulstrom80_aod_bb(atm_params['tau380'], atm_params['tau500'])

# recalculate Linke turbidity using Bird & Hulstrom broadband AOD
LT_AOD_CALC = pvlib.atmosphere.kasten96_lt(
    atm_params['amp'], atm_params['pwat'], AOD_CALC
)

In [8]:
# insert Linke turbidity to atmospheric parameters table
atm_params.insert(25, 'lt', LT)
atm_params.insert(26, 'lt_calc', LT_CALC)
atm_params.insert(27, 'lt_aod_calc', LT_AOD_CALC)
atm_params.insert(28, 'aod_calc', AOD_CALC)

# Linke turbidity should be continuous, it only depends on time
# fill in Linke turbidity by padding previous values
atm_params['lt'].fillna(method='pad', inplace=True)

In [15]:
# compare Bird & Hulstrom approximated broadband AOD to AOD at 700-nm
# downsample to monthly intervals to show long term trend for entire range of ECMWF-MACC data
aod_bb = atm_params[['tau700', 'aod_calc']].resample('D').mean()
aod_bb['2006-01-01':'2006-12-31'].tz_convert(METADATA['tz'][station_id]).plot()
plt.ylabel('broadband AOD')
plt.title('Comparison of calculated and measured broadband AOD')


Out[15]:
<matplotlib.text.Text at 0x2547b390>

In [62]:
# compare historic Linke turbidity to calculated
# downsample to monthly averages to show long term trends
atm_params[['lt', 'lt_calc', 'lt_aod_calc']].resample('M').mean().tz_convert(METADATA['tz'][station_id]).plot()
plt.ylabel('Linke turbidity')
plt.title('Comparison of measured and historical Linke turbidity')


Out[62]:
<matplotlib.text.Text at 0x2696a978>

In [63]:
# compare historic Linke turbidity to calculated
# downsample to monthly averages to show long term trends
lt = atm_params[['lt', 'lt_calc', 'lt_aod_calc']].resample('M').mean()
lt['2003-01-01':'2003-12-31'].tz_convert(METADATA['tz'][station_id]).plot()
plt.ylabel('Linke turbidity')
plt.title('Comparison of measured and historical Linke turbidity')
plt.savefig('%s_Linke_turbidity_2003_monthly.png' % station_id)



In [64]:
# compare historic Linke turbidity to calculated
# downsample to monthly averages to show long term trends
atm_params['lt'].tz_convert(METADATA['tz'][station_id]).groupby(lambda x: x.month).mean().plot(linewidth=5)
atm_params['lt_calc'].tz_convert(METADATA['tz'][station_id]).groupby(lambda x: x.month).mean().plot(linewidth=5)
for y in xrange(2003, 2013):
    lt = atm_params['lt_calc'][('%d-01-01' % y):('%d-12-31' % y)].tz_convert(METADATA['tz'][station_id]).groupby(lambda x: x.month).mean()
    lt.plot(linestyle=':')
plt.ylabel('Linke turbidity')
plt.xlabel('month')
plt.legend(['static', 'average', 'years'])
plt.title('Comparison of measured and historical Linke turbidity')
plt.savefig('%s_Linke_turbidity_allyears_monthly.png' % station_id)



In [17]:
# how is atmosphere changing over time?
# Calculate the linear regression of the relative difference between historical and measured Linke turbidity
lt_diff = (atm_params['lt_calc'] / atm_params['lt']).resample('A').mean() - 1.0  # yearly differences
x = np.arange(lt_diff.size)  # years
x = sm.add_constant(x)  # add y-intercept
y = lt_diff.values  # numpy array of yearly relative differences
results = sm.OLS(y, x).fit()  # fit linear regression
results.summary()  # output summary


c:\python27\lib\site-packages\scipy\stats\stats.py:1334: UserWarning: kurtosistest only valid for n>=20 ... continuing anyway, n=10
  "anyway, n=%i" % int(n))
Out[17]:
OLS Regression Results
Dep. Variable: y R-squared: 0.465
Model: OLS Adj. R-squared: 0.399
Method: Least Squares F-statistic: 6.966
Date: Wed, 07 Jun 2017 Prob (F-statistic): 0.0297
Time: 02:34:54 Log-Likelihood: 18.332
No. Observations: 10 AIC: -32.66
Df Residuals: 8 BIC: -32.06
Df Model: 1
Covariance Type: nonrobust
coef std err t P>|t| [0.025 0.975]
const 0.2273 0.025 8.939 0.000 0.169 0.286
x1 0.0126 0.005 2.639 0.030 0.002 0.024
Omnibus: 2.472 Durbin-Watson: 1.468
Prob(Omnibus): 0.291 Jarque-Bera (JB): 0.719
Skew: -0.649 Prob(JB): 0.698
Kurtosis: 3.205 Cond. No. 10.2

In [18]:
# plot yearly relative difference versus trendline
plt.plot(lt_diff)
plt.plot(lt_diff.index, results.fittedvalues)
plt.title('Yearly atmospheric trend between calculated and historic Linke turbidity')
plt.legend(['yearly relative differnce', 'trendline'])
plt.ylabel('relative difference')
plt.xlabel('years')
plt.savefig('%s_Linke_turbidity_trend.png' % station_id)


Clear sky models

There are several clear sky models, and they take different arguments.


In [11]:
# calculate clear sky
INEICHEN_LT = pvlib.clearsky.ineichen(
    atm_params['apparent_zenith'], atm_params['amp'], atm_params['lt'], altitude=METADATA['elevation'][station_id],
    dni_extra=atm_params['etr']
)
INEICHEN_CALC = pvlib.clearsky.ineichen(
    atm_params['apparent_zenith'], atm_params['amp'], atm_params['lt_calc'], altitude=METADATA['elevation'][station_id],
    dni_extra=atm_params['etr']
)
INEICHEN_AOD_CALC = pvlib.clearsky.ineichen(
    atm_params['apparent_zenith'], atm_params['amp'], atm_params['lt_aod_calc'], altitude=METADATA['elevation'][station_id],
    dni_extra=atm_params['etr']
)
SOLIS = pvlib.clearsky.simplified_solis(
    atm_params['apparent_elevation'],
    atm_params['tau700'],
    atm_params['pwat'],
    pressure=atm_params['press'],
    dni_extra=atm_params['etr']
)
BIRD = pvlib.clearsky.bird(
    atm_params['apparent_zenith'],
    atm_params['am'],
    atm_params['tau380'],
    atm_params['tau500'],
    atm_params['pwat'],
    pressure=atm_params['press'],
    dni_extra=atm_params['etr']
)


c:\python27\lib\site-packages\pvlib\clearsky.py:113: RuntimeWarning: invalid value encountered in maximum
  cos_zenith = np.maximum(tools.cosd(apparent_zenith), 0)
c:\python27\lib\site-packages\pvlib\clearsky.py:126: RuntimeWarning: invalid value encountered in fmax
  ghi = cg1 * dni_extra * cos_zenith * tl / tl * np.fmax(ghi, 0)
c:\python27\lib\site-packages\pvlib\clearsky.py:131: RuntimeWarning: invalid value encountered in fmax
  bnci = dni_extra * np.fmax(bnci, 0)
c:\python27\lib\site-packages\pvlib\clearsky.py:136: RuntimeWarning: invalid value encountered in fmax
  bnci_2 = ghi * np.fmin(np.fmax(bnci_2, 0), 1e20)
c:\python27\lib\site-packages\pvlib\clearsky.py:138: RuntimeWarning: invalid value encountered in minimum
  dni = np.minimum(bnci, bnci_2)
c:\python27\lib\site-packages\pvlib\clearsky.py:418: RuntimeWarning: invalid value encountered in maximum
  sin_elev = np.maximum(1.e-30, np.sin(np.radians(apparent_elevation)))

In [20]:
# plot direct normal irradiance
LEGEND = ['Ineichen-Linke', 'Ineichen-ECMWF-MACC', 'Ineichen-Bird-Hulstrom', 'SOLIS', 'BIRD', 'SURFRAD']
pd.concat([
    INEICHEN_LT['dni'], INEICHEN_CALC['dni'], INEICHEN_AOD_CALC['dni'],
    SOLIS['dni'], BIRD['dni'], atm_params['dni']
], axis=1)['2006-07-16T00:00:00':'2006-07-16T23:59:59'].resample('H').mean().tz_convert(METADATA['tz'][station_id]).plot()
plt.legend(LEGEND)
plt.title('Comparison of clear sky irradiance at %s 2003-2012' % METADATA['station name'][station_id])
plt.ylabel('$DNI [W/m^2]$')
plt.savefig('%s_DNI_2006-07-16.png' % station_id)



In [21]:
# plot diffuse irradiance
pd.concat([
    INEICHEN_LT['dhi'], INEICHEN_CALC['dhi'], INEICHEN_AOD_CALC['dhi'],
    SOLIS['dhi'], BIRD['dhi'], atm_params['dhi']
], axis=1)['2006-07-16T00:00:00':'2006-07-16T23:59:59'].resample('H').mean().tz_convert(METADATA['tz'][station_id]).plot()
plt.legend(LEGEND)
plt.title('Comparison of clear sky irradiance at %s 2003-2012' % METADATA['station name'][station_id])
plt.ylabel('$DHI [W/m^2]$')
plt.savefig('%s_DHI_2006-07-16.png' % station_id)



In [22]:
# plot global horizontal irradiance
pd.concat([
    INEICHEN_LT['ghi'], INEICHEN_CALC['ghi'], INEICHEN_AOD_CALC['ghi'],
    SOLIS['ghi'], BIRD['ghi'], atm_params['ghi']
], axis=1)['2006-07-16T00:00:00':'2006-07-16T23:59:59'].resample('H').mean().tz_convert(METADATA['tz'][station_id]).plot()
plt.legend(LEGEND)
plt.title('Comparison of clear sky irradiance at %s 2003-2012' % METADATA['station name'][station_id])
plt.ylabel('$GHI [W/m^2]$')
plt.savefig('%s_GHI_2006-07-16.png' % station_id)



In [12]:
# add clear sky calc to atmospheric parameters
atm_params.insert(29, 'solis_dhi', SOLIS['dhi'])
atm_params.insert(30, 'solis_dni', SOLIS['dni'])
atm_params.insert(31, 'solis_ghi', SOLIS['ghi'])
atm_params.insert(32, 'lt_dhi', INEICHEN_LT['dhi'])
atm_params.insert(33, 'lt_dni', INEICHEN_LT['dni'])
atm_params.insert(34, 'lt_ghi', INEICHEN_LT['ghi'])
atm_params.insert(35, 'macc_dhi', INEICHEN_CALC['dhi'])
atm_params.insert(36, 'macc_dni', INEICHEN_CALC['dni'])
atm_params.insert(37, 'macc_ghi', INEICHEN_CALC['ghi'])
atm_params.insert(38, 'bird_dhi', BIRD['dhi'])
atm_params.insert(39, 'bird_dni', BIRD['dni'])
atm_params.insert(40, 'bird_ghi', BIRD['ghi'])

# make even intervals at 1-minute and 3-minutes
atm_params_3min = atm_params.resample('3T').mean() # down using averages
atm_params_1min = atm_params.resample('T').pad() # up padding previous value

# add columns for year and month for pivot and groupby plots
atm_params_3min.insert(0, 'month', atm_params_3min.index.month)
atm_params_3min.insert(0, 'year', atm_params_3min.index.year)

Detect Clear Sky

To compare the different models and the effects of historical versus measured atmospheric data, filter out any timestamps with clouds and low irradiance using pvlib.clearsky.detect_clearsky().


In [13]:
# detect clear sky indices

# 1-minute data with 10-minute window
is_clear_1min = pvlib.clearsky.detect_clearsky(atm_params_1min['ghi'], atm_params_1min['solis_ghi'], atm_params_1min.index, 10)
# 3-minute data with 30 minute window
is_clear_3min = pvlib.clearsky.detect_clearsky(atm_params_3min['ghi'], atm_params_3min['solis_ghi'], atm_params_3min.index, 10)

# filter out low light
is_bright = atm_params_3min['ghi'] > 200


c:\python27\lib\site-packages\pvlib\clearsky.py:658: RuntimeWarning: divide by zero encountered in true_divide
  meas_slope_nstd = np.std(meas_slope, axis=0, ddof=1) / meas_mean
c:\python27\lib\site-packages\pvlib\clearsky.py:658: RuntimeWarning: invalid value encountered in true_divide
  meas_slope_nstd = np.std(meas_slope, axis=0, ddof=1) / meas_mean
c:\python27\lib\site-packages\pvlib\clearsky.py:680: RuntimeWarning: invalid value encountered in less
  c1 = np.abs(meas_mean - alpha*clear_mean) < mean_diff
c:\python27\lib\site-packages\pvlib\clearsky.py:681: RuntimeWarning: invalid value encountered in less
  c2 = np.abs(meas_max - alpha*clear_max) < max_diff
c:\python27\lib\site-packages\pvlib\clearsky.py:682: RuntimeWarning: invalid value encountered in greater
  c3 = (line_diff > lower_line_length) & (line_diff < upper_line_length)
c:\python27\lib\site-packages\pvlib\clearsky.py:682: RuntimeWarning: invalid value encountered in less
  c3 = (line_diff > lower_line_length) & (line_diff < upper_line_length)
c:\python27\lib\site-packages\pvlib\clearsky.py:683: RuntimeWarning: invalid value encountered in less
  c4 = meas_slope_nstd < var_diff
c:\python27\lib\site-packages\pvlib\clearsky.py:684: RuntimeWarning: invalid value encountered in less
  c5 = (meas_slope_max - alpha*clear_slope_max) < slope_dev

In [49]:
# save data
np_atm_params_3min_clear = atm_params_3min.loc[is_clear_3min].to_records()
np_atm_params_3min_clear.index = np.array([dt.isoformat() for dt in np_atm_params_3min_clear.index.tolist()], dtype=str)
np_atm_params_3min_clear = np_atm_params_3min_clear.astype(np.dtype(
    [('index', str, 25)]
    + [(str(n), dt) for n, dt in np_atm_params_3min_clear.dtype.descr if n != 'index']
))
with h5py.File('%s_3min_clear_atm_params.h5' % station_id, 'w') as f:
    f['data'] = np_atm_params_3min_clear

In [50]:
# check hdf5 file loads back into pandas
with h5py.File('%s_3min_clear_atm_params.h5' % station_id, 'r') as f:
    np_atm_params_3min_clear = pd.DataFrame(np.array(f['data']))
np_atm_params_3min_clear['index'] = pd.DatetimeIndex(np_atm_params_3min_clear['index'])
np_atm_params_3min_clear.set_index('index', inplace=True)
np_atm_params_3min_clear.index.rename('timestamps', inplace=True)
np_atm_params_3min_clear


Out[50]:
year month solzen ghi dni dhi ta rh press alpha ... solis_ghi lt_dhi lt_dni lt_ghi macc_dhi macc_dni macc_ghi bird_dhi bird_dni bird_ghi
timestamps
2003-06-15 02:00:00 2003 6 88.750000 0.400000 0.500000 8.300000 25.800000 31.400000 96220.000000 1.388619 ... 3.903660 7.401136 1.218962 7.421796 6.477130 0.896612 6.492327 1.200775 94.970621 2.810373
2003-06-15 02:03:00 2003 6 89.130000 -1.200000 -1.700000 7.700000 25.500000 34.700000 96230.000000 1.388619 ... 1.486703 5.903362 0.434662 5.907668 3.761194 0.154327 3.762723 0.349004 89.293288 1.233747
2003-06-15 02:06:00 2003 6 89.500000 -2.000000 -2.300000 6.800000 25.100000 37.900000 96230.000000 1.388619 ... 0.125477 3.056440 0.126201 3.056828 1.207782 0.014976 1.207828 0.015015 90.917218 0.294593
2003-06-15 02:09:00 2003 6 89.830000 -3.000000 -2.400000 5.600000 24.700000 37.100000 96240.000000 1.388619 ... 0.000000 0.000000 0.000000 0.000000 NaN NaN NaN NaN NaN NaN
2003-06-15 02:12:00 2003 6 90.980000 -3.700000 -2.200000 4.400000 24.600000 36.200000 96240.000000 1.388619 ... 0.000000 0.000000 0.000000 0.000000 NaN NaN NaN NaN NaN NaN
2003-06-15 10:42:00 2003 6 91.350000 -4.600000 0.200000 1.700000 18.800000 56.900000 96400.000000 1.385856 ... 0.000000 0.000000 0.000000 0.000000 NaN NaN NaN NaN NaN NaN
2003-06-15 10:45:00 2003 6 90.910000 -4.400000 0.200000 1.800000 18.700000 57.500000 96400.000000 1.385856 ... 0.000000 0.000000 0.000000 0.000000 NaN NaN NaN NaN NaN NaN
2003-06-15 10:48:00 2003 6 89.780000 -4.300000 0.200000 2.100000 18.700000 58.000000 96400.000000 1.385856 ... 0.224854 3.864784 0.158889 3.865471 0.803959 0.004322 0.803978 0.044762 56.504668 0.289215
2003-06-15 10:51:00 2003 6 89.440000 -4.000000 0.300000 2.600000 18.600000 58.300000 96400.000000 1.385856 ... 1.642158 6.241162 0.525219 6.247033 2.392826 0.058070 2.393475 0.473012 59.804943 1.141530
2003-06-15 10:54:00 2003 6 89.070000 -3.600000 3.900000 3.200000 18.600000 57.900000 96400.000000 1.385856 ... 4.012569 7.673009 1.420514 7.698911 4.467645 0.408862 4.475100 1.447206 67.407828 2.676361
2003-06-15 10:57:00 2003 6 88.680000 -3.000000 12.300000 3.600000 18.600000 57.900000 96390.000000 1.385856 ... 7.097893 9.137867 3.239919 9.220435 7.051770 1.776650 7.097047 2.979300 78.293102 4.974555
2003-06-15 11:00:00 2003 6 88.280000 -2.500000 0.300000 3.800000 18.600000 58.100000 96390.000000 1.385856 ... 10.759475 10.857416 6.408121 11.068277 10.093805 5.398160 10.271433 5.016091 91.826425 8.037654
2003-06-15 11:03:00 2003 6 87.860000 -1.800000 0.300000 4.500000 18.700000 58.100000 96400.000000 1.385856 ... 14.902027 12.862035 11.284234 13.318716 13.473198 12.620026 13.983939 7.471013 107.470948 11.820442
2003-06-15 11:06:00 2003 6 87.430000 -0.800000 0.200000 5.500000 18.700000 58.000000 96400.000000 1.385856 ... 19.458490 15.124805 18.110776 15.997058 17.034624 24.380206 18.208826 10.247633 124.752303 16.255966
2003-06-15 14:48:00 2003 6 49.190000 651.600000 750.800000 165.800000 23.700000 51.700000 96460.000000 1.421208 ... 619.018137 124.596785 749.106130 616.776637 109.895007 797.364867 633.781952 115.118036 793.483012 636.454514
2003-06-15 14:51:00 2003 6 48.660000 663.700000 761.600000 165.200000 23.900000 50.900000 96460.000000 1.421208 ... 626.937537 125.282920 752.208616 624.805866 110.734491 799.379724 641.582591 115.565365 796.138940 644.261341
2003-06-15 14:54:00 2003 6 48.120000 673.200000 766.500000 165.300000 23.800000 50.900000 96460.000000 1.421208 ... 634.792910 125.959238 755.230226 632.764537 111.565942 801.339763 649.313538 116.003876 798.729813 652.000037
2003-06-15 17:03:00 2003 6 26.910000 950.700000 799.900000 221.700000 27.600000 33.500000 96430.000000 1.446498 ... 886.301440 146.139788 830.509841 887.995393 139.527627 845.891759 895.123181 130.478527 859.667476 898.379285
2003-06-15 17:06:00 2003 6 26.520000 949.100000 794.600000 221.000000 27.400000 35.700000 96430.000000 1.446498 ... 889.854449 146.404127 831.322409 891.525132 139.889378 846.420776 898.543172 130.627706 860.409926 901.820088
2003-06-15 17:09:00 2003 6 26.130000 954.100000 784.700000 233.100000 27.400000 36.400000 96430.000000 1.446498 ... 893.284742 146.658977 832.102133 894.932068 140.238422 846.928444 901.844166 130.771269 861.122843 905.141417
2003-06-15 19:42:00 2003 6 25.400000 1003.600000 830.800000 233.600000 29.000000 30.600000 96350.000000 1.461335 ... 892.301203 146.835130 833.010282 898.177642 142.775729 842.369869 902.560220 133.784554 855.604389 905.506061
2003-06-15 19:45:00 2003 6 25.770000 1000.800000 827.600000 236.300000 28.700000 30.500000 96350.000000 1.461335 ... 888.994761 146.589392 832.261927 894.888162 142.436030 841.871616 899.375011 133.643464 854.909759 902.305229
2003-06-15 19:48:00 2003 6 26.150000 994.500000 829.300000 232.000000 28.900000 30.600000 96350.000000 1.461335 ... 885.565120 146.334151 831.481097 891.475229 142.083455 841.351779 896.070256 133.496656 854.185464 898.984507
2003-06-15 20:15:00 2003 6 29.910000 928.500000 805.300000 216.800000 29.800000 29.200000 96360.000000 1.461335 ... 849.283874 143.619102 822.899407 855.305951 138.344139 835.640411 861.050082 131.916818 846.272416 863.817879
2003-06-15 20:18:00 2003 6 30.360000 928.800000 808.100000 216.000000 29.500000 29.400000 96350.000000 1.461335 ... 844.674438 143.262617 821.795483 850.713252 137.862457 834.905883 856.599307 131.708629 845.240607 859.342226
2003-06-15 20:21:00 2003 6 30.810000 930.800000 810.000000 218.800000 29.200000 29.100000 96350.000000 1.461335 ... 839.945458 142.903863 820.628538 845.990632 137.372205 834.129441 852.026091 131.497161 844.167722 854.751506
2003-06-15 20:24:00 2003 6 31.270000 914.000000 803.700000 213.300000 29.300000 28.800000 96350.000000 1.461335 ... 835.103861 142.535770 819.423271 841.153686 136.869822 833.327515 847.342116 131.279604 843.060664 850.050116
2003-06-15 20:27:00 2003 6 31.740000 920.200000 805.300000 219.100000 29.400000 29.200000 96350.000000 1.461335 ... 830.150818 142.158361 818.179083 836.203444 136.355394 832.499700 842.548371 131.055923 841.918977 845.239087
2003-06-15 20:30:00 2003 6 32.210000 932.200000 803.900000 233.800000 29.100000 28.800000 96350.000000 1.461335 ... 825.087543 141.771657 816.895355 831.140976 135.829012 831.645574 837.645885 130.826082 840.742191 840.319492
2003-06-15 20:33:00 2003 6 32.680000 934.400000 799.000000 245.500000 29.500000 28.300000 96350.000000 1.461335 ... 819.914761 141.375641 815.571311 825.966862 135.290717 830.764613 832.635208 130.590017 839.529696 835.291925
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2012-12-31 18:36:00 2012 12 66.733333 424.333333 606.533333 206.866667 -10.266667 81.366667 96540.000000 1.242034 ... 417.438749 65.196159 833.112480 394.281629 43.630944 943.123964 416.171746 58.661245 898.031255 413.390101
2012-12-31 19:00:00 2012 12 67.083333 400.800000 829.600000 106.900000 -10.433333 80.033333 96520.000000 1.242034 ... 410.416557 64.790061 828.891693 387.413023 43.165616 941.008582 409.426901 58.256298 894.428568 406.387642
2012-12-31 19:03:00 2012 12 67.160000 400.366667 834.033333 105.633333 -10.433333 79.933333 96516.666667 1.242034 ... 408.922719 64.703992 827.977281 385.951395 43.067378 940.549320 407.991618 58.169543 893.650690 404.897593
2012-12-31 19:06:00 2012 12 67.240000 401.900000 842.633333 105.066667 -10.566667 79.966667 96510.000000 1.242034 ... 407.294612 64.608886 826.983617 384.359686 42.960098 940.049849 406.427557 58.073933 892.802383 403.273341
2012-12-31 19:09:00 2012 12 67.330000 384.466667 801.566667 103.200000 -10.600000 80.100000 96503.333333 1.242034 ... 405.530761 64.506051 825.898841 382.634999 42.844251 939.504101 404.732865 57.970085 891.877685 401.513475
2012-12-31 19:12:00 2012 12 67.426667 386.300000 815.200000 101.500000 -10.700000 80.400000 96500.000000 1.242034 ... 403.630966 64.396864 824.711569 380.775704 42.720424 938.906204 402.906928 57.858672 890.871146 399.617853
2012-12-31 19:15:00 2012 12 67.530000 386.800000 831.433333 98.200000 -10.800000 80.666667 96490.000000 1.242034 ... 401.599058 64.277338 823.450980 378.789853 42.587390 938.270716 400.954554 57.737449 889.795697 397.589918
2012-12-31 19:18:00 2012 12 67.640000 383.000000 829.733333 96.333333 -10.800000 80.233333 96490.000000 1.242034 ... 399.429378 64.154133 822.063738 376.664752 42.447482 937.570572 398.868333 57.609890 888.626404 395.424509
2012-12-31 19:21:00 2012 12 67.760000 386.766667 847.500000 95.500000 -10.933333 80.800000 96480.000000 1.242034 ... 397.130283 64.019321 820.610385 374.417153 42.298196 936.836146 396.658558 57.471654 887.390699 393.129246
2012-12-31 19:24:00 2012 12 67.880000 389.900000 863.133333 95.133333 -10.933333 80.666667 96490.000000 1.242034 ... 394.691007 63.884862 818.995175 372.023732 42.143719 936.018816 394.311423 57.328992 886.044129 390.694311
2012-12-31 19:27:00 2012 12 68.010000 387.333333 866.633333 94.200000 -10.900000 80.333333 96493.333333 1.242034 ... 392.121892 63.740142 817.298999 369.506005 41.980608 935.159235 391.839866 57.176111 884.623218 388.129071
2012-12-31 19:30:00 2012 12 68.146667 383.600000 869.600000 92.166667 -10.833333 80.066667 96493.333333 1.242034 ... 389.422437 63.586514 815.509374 366.862050 41.809466 934.250870 389.242965 57.013615 883.121664 385.433063
2012-12-31 19:33:00 2012 12 68.290000 380.233333 873.266667 89.766667 -10.866667 80.566667 96490.000000 1.242034 ... 386.593794 63.424026 813.624189 364.093004 41.630499 933.292377 386.521776 56.841401 881.537998 382.607355
2012-12-31 19:36:00 2012 12 68.440000 376.700000 874.766667 87.900000 -10.800000 80.633333 96486.666667 1.242034 ... 383.634913 63.253997 811.630045 361.196436 41.444318 932.276663 383.674859 56.660003 879.865286 379.650926
2012-12-31 19:39:00 2012 12 68.596667 373.533333 873.033333 87.266667 -10.833333 80.800000 96480.000000 1.242034 ... 380.548401 63.075154 809.535137 358.176317 41.250692 931.207564 380.705034 56.468614 878.106717 376.566181
2012-12-31 19:42:00 2012 12 68.760000 369.100000 867.566667 87.100000 -10.866667 81.033333 96480.000000 1.242034 ... 377.331324 62.891502 807.304515 355.025636 41.051286 930.066861 377.608352 56.269142 876.245624 373.350439
2012-12-31 19:45:00 2012 12 68.930000 366.333333 868.300000 86.933333 -10.833333 80.700000 96480.000000 1.242034 ... 373.986652 62.700386 804.955929 351.750124 40.845363 928.863178 374.388341 56.060006 874.289216 370.006343
2012-12-31 19:48:00 2012 12 69.110000 361.333333 862.633333 86.633333 -10.800000 81.033333 96480.000000 1.242034 ... 370.515522 62.501841 802.486124 348.350934 40.633190 927.594381 371.046043 55.841038 872.235236 366.534931
2012-12-31 19:51:00 2012 12 69.290000 358.500000 863.400000 86.333333 -10.866667 80.933333 96480.000000 1.242034 ... 366.919242 62.295910 799.891723 344.829395 40.415061 926.258226 367.582663 55.612064 870.081360 362.937403
2012-12-31 19:54:00 2012 12 69.476667 349.500000 845.600000 86.033333 -10.900000 80.833333 96486.666667 1.242034 ... 363.196112 62.085256 797.146799 341.181214 40.192295 924.840744 363.995858 55.374200 867.814881 359.212194
2012-12-31 19:57:00 2012 12 69.670000 341.200000 825.666667 86.800000 -11.000000 81.333333 96493.333333 1.242034 ... 359.349794 61.867252 794.269006 337.412760 39.964194 923.350383 360.289779 55.125879 865.442836 355.362613
2012-12-31 20:33:00 2012 12 72.470000 273.000000 698.433333 92.900000 -11.133333 80.133333 96536.666667 1.242034 ... 304.015041 58.665642 747.552053 283.286958 36.945575 898.479999 306.916018 51.217295 827.460772 299.848758
2012-12-31 20:36:00 2012 12 72.740000 265.433333 687.300000 90.833333 -11.166667 80.066667 96530.000000 1.242034 ... 298.683398 58.346970 742.466071 278.086042 36.683968 895.687826 301.769147 50.801785 823.374960 294.485977
2012-12-31 20:39:00 2012 12 73.020000 256.166667 651.833333 93.533333 -11.300000 80.333333 96530.000000 1.242034 ... 293.246095 58.023475 737.128065 272.782319 36.425408 892.737778 296.519184 50.371811 819.108446 289.014670
2012-12-31 21:24:00 2012 12 77.750000 147.633333 519.800000 63.666667 -12.300000 81.466667 96520.000000 1.262943 ... 200.405729 52.154306 620.136560 183.156976 33.798079 818.434906 206.687776 42.396850 723.068793 195.141818
2012-12-31 21:27:00 2012 12 78.103333 156.600000 590.300000 63.200000 -12.400000 81.733333 96520.000000 1.262943 ... 193.598305 51.680576 609.002436 176.665182 33.718072 810.890789 200.132588 41.578186 714.594117 188.231086
2012-12-31 21:30:00 2012 12 78.460000 165.400000 665.266667 63.600000 -12.500000 81.966667 96520.000000 1.262943 ... 186.728752 51.192998 597.304746 170.134462 33.663102 802.810506 193.523462 40.723802 705.729819 181.253662
2012-12-31 21:36:00 2012 12 79.183333 147.833333 636.066667 57.066667 -12.533333 81.900000 96510.000000 1.262943 ... 172.819059 50.167983 572.119137 156.984265 33.633812 784.849248 180.163444 38.898053 686.758428 167.115032
2012-12-31 21:39:00 2012 12 79.553333 141.800000 630.266667 55.266667 -12.600000 82.166667 96510.000000 1.262943 ... 165.785242 49.631783 558.514417 150.375083 33.668010 774.809605 173.421830 37.923360 676.588833 159.961536
2012-12-31 21:42:00 2012 12 79.923333 135.800000 617.733333 54.233333 -12.600000 82.000000 96510.000000 1.262943 ... 158.704554 49.076378 544.193449 143.754268 33.737001 763.971768 166.647101 36.903589 665.933744 152.758285

289566 rows × 43 columns

Use of Measured Aerosol Optical Depth and Precipitable Water to Model Clear Sky Irradiance by Mark A. Mikofski, Clifford W. Hansen, William F. Holmgren and Gregory M. Kimball is licensed under a Creative Commons Attribution 4.0 International License.